AITopics | field pair

An Augmentation Strategy for Visually Rich Documents

Xie, Jing, Wendt, James B., Zhou, Yichao, Ebner, Seth, Tata, Sandeep

arXiv.org Artificial Intelligence, Dec-22-2022

Many business workflows require extracting important fields from form-like documents (e.g., bank statements, bills of lading, and purchase orders). Recent techniques for automating this task work well only when trained with large datasets. In this work, we propose a novel data augmentation technique to improve performance when training data is scarce (e.g., 10-250 documents). Our technique, which we call FieldSwap, works by swapping out the key phrases of a source field with the key phrases of a target field to generate new synthetic examples of the target field for use in training. We demonstrate that this approach can yield improvements of 1-7 F1 points in extraction performance.
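The swap described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the `Example` structure, the key-phrase lists, and the `field_swap` helper are hypothetical names and are not taken from the paper, and layout information, which matters in visually rich documents, is ignored here.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Example:
    """Hypothetical, text-only stand-in for a labeled training example."""
    text: str    # snippet containing the key phrase and the value
    value: str   # the labeled field value
    field: str   # the field label


def field_swap(example: Example, source_phrases: List[str],
               target_field: str, target_phrases: List[str]) -> List[Example]:
    """Generate synthetic target-field examples from one source-field example."""
    synthetic = []
    for src in source_phrases:
        if src not in example.text:
            continue  # this source key phrase does not occur in the snippet
        for tgt in target_phrases:
            # Swap the key phrase and relabel; the value itself is left untouched.
            synthetic.append(replace(
                example,
                text=example.text.replace(src, tgt),
                field=target_field,
            ))
    return synthetic


source = Example(text="Invoice Date: 2022-12-22", value="2022-12-22",
                 field="invoice_date")
augmented = field_swap(source, ["Invoice Date"], "due_date",
                       ["Due Date", "Payment Due"])
for ex in augmented:
    print(ex.field, "|", ex.text)
```

Each source example yields one synthetic example per matching pair of source and target key phrases, which is how a small training set could be expanded for a rare target field.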

artificial intelligence, machine learning, natural language, (18 more...)

arXiv: 2212.10047

Country:

Europe (1.00)
North America > United States > Montana > Roosevelt County (0.46)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

AutoAttention: Automatic Field Pair Selection for Attention in User Behavior Modeling

Zheng, Zuowu, Gao, Xiaofeng, Pan, Junwei, Luo, Qi, Chen, Guihai, Liu, Dapeng, Jiang, Jie

arXiv.org Artificial Intelligence, Oct-26-2022

In click-through rate (CTR) prediction models, a user's interest is usually represented as a fixed-length vector based on her historical behaviors. Recently, several methods have been proposed to learn an attentive weight for each user behavior and conduct weighted sum pooling. However, these methods manually select only a few fields from the target item side as the query to interact with the behaviors, neglecting the other target item fields as well as the user and context fields. Directly including all these fields in the attention may introduce noise and degrade performance. In this paper, we propose a novel model named AutoAttention, which includes all item/user/context side fields as the query and assigns a learnable weight to each field pair between behavior fields and query fields. Pruning these field pairs via the learnable weights leads to automatic field pair selection, which identifies and removes noisy field pairs. Although it includes more fields, the computation cost of AutoAttention remains low because it uses a simple attention function and field pair selection. Extensive experiments on a public dataset and Tencent's production dataset demonstrate the effectiveness of the proposed approach.
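The idea of weighting and pruning field pairs can be illustrated with a toy example. The sketch below makes several assumptions that the abstract does not confirm: the attention function is a plain dot product, pruning keeps the pairs with the largest absolute weight, and the weights are fixed numbers rather than learned ones. All field names and values are made up.

```python
from typing import Dict, List, Tuple

Vector = List[float]
FieldPair = Tuple[str, str]  # (behavior field, query field)


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def prune(weights: Dict[FieldPair, float], keep: int) -> Dict[FieldPair, float]:
    """Keep only the `keep` field pairs with the largest absolute weight."""
    ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return dict(ranked[:keep])


def attention_score(behavior: Dict[str, Vector], query: Dict[str, Vector],
                    weights: Dict[FieldPair, float]) -> float:
    """Weighted sum of dot products over the retained field pairs."""
    return sum(w * dot(behavior[b], query[q]) for (b, q), w in weights.items())


# Field embeddings of one historical behavior and of the query side
# (target item, user, and context fields); values are illustrative only.
behavior = {"item_id": [1.0, 0.0], "category": [0.0, 1.0]}
query = {"target_item": [1.0, 1.0], "user_age": [0.5, 0.5], "hour": [2.0, 0.0]}

# One weight per (behavior field, query field) pair; learned in a real model.
weights = {
    ("item_id", "target_item"): 0.9,
    ("item_id", "user_age"): 0.01,
    ("item_id", "hour"): -0.02,
    ("category", "target_item"): 0.6,
    ("category", "user_age"): -0.4,
    ("category", "hour"): 0.03,
}

selected = prune(weights, keep=3)
print(sorted(selected))
print(attention_score(behavior, query, selected))
```

After pruning, only three of the six field pairs contribute to the score, so the near-zero pairs add neither noise nor computation.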

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv: 2210.15154

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Shandong Province (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Field-Embedded Factorization Machines for Click-through rate prediction

Pande, Harshit

arXiv.org Machine Learning, Sep-13-2020

Click-through rate (CTR) prediction models are common in many online applications such as digital advertising and recommender systems. Field-Aware Factorization Machine (FFM) and Field-weighted Factorization Machine (FwFM) are state-of-the-art among the shallow models for CTR prediction. Recently, many deep learning-based models have also been proposed. Among deeper models, DeepFM, xDeepFM, AutoInt+, and FiBiNet are state-of-the-art models. The deeper models combine a core architectural component, which learns explicit feature interactions, with a deep neural network (DNN) component. We propose a novel shallow Field-Embedded Factorization Machine (FEFM) and its deep counterpart Deep Field-Embedded Factorization Machine (DeepFEFM). FEFM learns symmetric matrix embeddings for each field pair along with the usual single vector embeddings for each feature. FEFM has significantly lower model complexity than FFM and roughly the same complexity as FwFM. FEFM also has insightful mathematical properties about important fields and field interactions. DeepFEFM combines the FEFM interaction vectors learned by the FEFM component with a DNN and is thus able to learn higher order interactions. We conducted comprehensive experiments over a wide range of hyperparameters on two large publicly available real-world datasets. When comparing test AUC and log loss, the results show that FEFM and DeepFEFM outperform the existing state-of-the-art shallow and deep models for CTR prediction tasks.
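To make the "symmetric matrix embedding per field pair" concrete, the sketch below computes a pairwise interaction term of the form v_i^T W v_j, where W is the matrix associated with the two features' fields. This is one reading of the abstract, not the paper's reference implementation: the bias and linear terms and the DNN component of DeepFEFM are omitted, features are treated as one-hot, and all names and numbers are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def bilinear(u: Vector, w: Matrix, v: Vector) -> float:
    """Compute u^T W v."""
    return sum(u[a] * w[a][b] * v[b]
               for a in range(len(u)) for b in range(len(v)))


def fefm_interactions(active: List[Tuple[str, str]],
                      embeddings: Dict[str, Vector],
                      field_pair_matrices: Dict[Tuple[str, str], Matrix]) -> float:
    """Sum the pairwise terms over the active (field, feature) pairs."""
    total = 0.0
    for (field_i, feat_i), (field_j, feat_j) in combinations(active, 2):
        # One matrix per unordered field pair. Because the matrix is symmetric,
        # u^T W v equals v^T W u, so the lookup order does not matter.
        w = field_pair_matrices[tuple(sorted((field_i, field_j)))]
        total += bilinear(embeddings[feat_i], w, embeddings[feat_j])
    return total


active = [("publisher", "espn"), ("advertiser", "nike")]
embeddings = {"espn": [1.0, 2.0], "nike": [0.5, -1.0]}
matrices = {("advertiser", "publisher"): [[1.0, 0.5], [0.5, 2.0]]}

print(fefm_interactions(active, embeddings, matrices))
```

With m fields and embedding size k, this layout stores one k-by-k matrix per field pair on top of a single k-dimensional vector per feature, which is consistent with the abstract's claim that the model is far smaller than FFM.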

artificial intelligence, factorization machine, machine learning, (16 more...)

arXiv: 2009.09931

Genre: Research Report > New Finding (0.66)

Industry: Marketing (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

NEMA: Automatic Integration of Large Network Management Databases

Wu, Fubao, Song, Han Hee, Yin, Jiangtao, Gao, Lixin, Baldi, Mario, Anand, Narendra

arXiv.org Artificial Intelligence, Jun-1-2020

Network management, whether for malfunction analysis, failure prediction, or performance monitoring and improvement, generally involves large amounts of data from different sources. To effectively integrate and manage these sources, automatically finding semantic matches among their schemas or ontologies is crucial. Existing approaches to database matching mainly fall into two categories. One focuses on schema-level matching based on schema properties such as field names, data types, constraints, and schema structures. Network management databases contain massive tables (e.g., network products, incidents, security alerts, and logs) from different departments and groups with nonuniform field names and schema characteristics, so matching them by those schema properties is not reliable. The other category is based on instance-level matching using general string similarity techniques, which are not applicable to the matching of large network management databases. In this paper, we develop a matching technique for large NEtwork MAnagement databases (NEMA) that deploys instance-level matching for effective data integration and connection. We design matching metrics and scores for both numerical and non-numerical fields and propose algorithms for matching these fields. The effectiveness and efficiency of NEMA are evaluated by conducting experiments based on ground truth field pairs in large network management databases. Our measurements on large databases with 1,458 fields, each of which contains over 10 million records, reveal that the accuracy of NEMA reaches up to 95%. It achieves 2%-10% higher accuracy and a 5x-14x speedup over baseline methods.
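Instance-level matching compares the values stored in two fields rather than their names. The abstract does not state which metrics NEMA uses, so the sketch below substitutes two generic stand-ins: Jaccard overlap of distinct values for non-numerical fields and value-range overlap for numerical fields. The function names and sample data are hypothetical and should not be read as NEMA's actual scores.

```python
from typing import Sequence, Union

Value = Union[str, float, int]


def is_numeric(values: Sequence[Value]) -> bool:
    return all(isinstance(v, (int, float)) and not isinstance(v, bool)
               for v in values)


def jaccard(a: Sequence[Value], b: Sequence[Value]) -> float:
    """Share of distinct values that two fields have in common."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def range_overlap(a: Sequence[float], b: Sequence[float]) -> float:
    """Length of the overlap of two value ranges divided by their combined span."""
    lo, hi = max(min(a), min(b)), min(max(a), max(b))
    span = max(max(a), max(b)) - min(min(a), min(b))
    if span == 0:
        return 1.0  # both fields hold the same single value
    return max(0.0, hi - lo) / span


def match_score(a: Sequence[Value], b: Sequence[Value]) -> float:
    """Score a candidate field pair from its instances (column values) only."""
    if not a or not b:
        return 0.0
    if is_numeric(a) != is_numeric(b):
        return 0.0  # never match a numerical field against a non-numerical one
    return range_overlap(a, b) if is_numeric(a) else jaccard(a, b)


incident_device = ["rtr-01", "rtr-02", "sw-07"]
alert_host = ["rtr-02", "sw-07", "fw-03"]
latency_a = [10, 20, 30]
latency_b = [20, 40]

print(match_score(incident_device, alert_host))
print(match_score(latency_a, latency_b))
print(match_score(incident_device, latency_a))
```

Note that `incident_device` and `alert_host` share no part of their field names, yet their values overlap, which is the situation where schema-level matching fails and instance-level matching helps.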

artificial intelligence, field pair, machine learning, (19 more...)

arXiv: 2006.01294

Country: North America > United States (1.00)

Genre:

Research Report (0.50)
Personal (0.46)

Industry: Information Technology > Networks (0.67)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.66)
Information Technology > Communications > Web > Semantic Web (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Predicting Different Types of Conversions with Multi-Task Learning in Online Advertising

Pan, Junwei, Mao, Yizhi, Ruiz, Alfonso Lobos, Sun, Yu, Flores, Aaron

arXiv.org Machine Learning, Jul-24-2019

Conversion prediction plays an important role in online advertising since Cost-Per-Action (CPA) has become one of the primary campaign performance objectives in the industry. Unlike clicks, conversions come in different types, and each type may be associated with different decisive factors. In this paper, we formulate conversion prediction as a multi-task learning problem, so that the prediction models for different types of conversions can be learned together. These models share feature representations but have their own specific parameters, providing the benefit of information sharing across all tasks. We then propose Multi-Task Field-weighted Factorization Machine (MT-FwFM) to solve these tasks jointly. Our experimental results show that, compared with two state-of-the-art models, MT-FwFM improves the AUC by 0.74% and 0.84% on two conversion types, and the weighted AUC across all conversion types is also improved by 0.50%.
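The split between shared representations and task-specific parameters can be sketched as follows. The abstract does not spell out which parameters are task-specific, so this sketch assumes that feature embeddings are shared while each conversion type has its own bias and field-pair weights; linear terms are omitted, and all names and numbers are made up.

```python
import math
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def predict(active: List[Tuple[str, str]],
            shared_embeddings: Dict[str, Vector],
            bias: float,
            pair_weights: Dict[FrozenSet[str], float]) -> float:
    """Probability for one task: shared embeddings, task-specific bias and weights."""
    logit = bias
    for (field_i, feat_i), (field_j, feat_j) in combinations(active, 2):
        r = pair_weights[frozenset((field_i, field_j))]
        logit += r * dot(shared_embeddings[feat_i], shared_embeddings[feat_j])
    return 1.0 / (1.0 + math.exp(-logit))


# Shared across all conversion types.
shared_embeddings = {"espn": [1.0, 0.0], "nike": [1.0, 1.0], "mobile": [0.0, 2.0]}

PA = frozenset(("publisher", "advertiser"))
PD = frozenset(("publisher", "device"))
AD = frozenset(("advertiser", "device"))

# One parameter set per conversion type (hypothetical values).
tasks = {
    "purchase": {"bias": -1.0, "pair_weights": {PA: 1.0, PD: 0.5, AD: 0.0}},
    "lead": {"bias": 0.0, "pair_weights": {PA: 0.0, PD: 0.5, AD: 1.0}},
}

active = [("publisher", "espn"), ("advertiser", "nike"), ("device", "mobile")]
predictions = {
    name: predict(active, shared_embeddings, p["bias"], p["pair_weights"])
    for name, p in tasks.items()
}
print(predictions)
```

Both tasks read the same embeddings, so an update driven by one conversion type also changes the inputs of the other, which is where the information sharing comes from.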

artificial intelligence, conversion type, machine learning, (17 more...)

doi: 10.1145/3292500.3330783

arXiv: 1907.10235

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Marketing (1.00)
Information Technology > Services (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising

Pan, Junwei, Xu, Jian, Ruiz, Alfonso Lobos, Zhao, Wenliang, Pan, Shengjun, Sun, Yu, Lu, Quan

arXiv.org Machine Learning, Jun-9-2018

Click-through rate (CTR) prediction is a critical task in online display advertising. The data involved in CTR prediction are typically multi-field categorical data, i.e., every feature is categorical and belongs to one and only one field. One of the interesting characteristics of such data is that features from one field often interact differently with features from other fields. Recently, Field-aware Factorization Machines (FFMs) have been among the best performing models for CTR prediction by explicitly modeling such differences. However, the number of parameters in FFMs is on the order of feature number times field number, which is unacceptable in real-world production systems. In this paper, we propose Field-weighted Factorization Machines (FwFMs) to model the different feature interactions between different fields in a much more memory-efficient way. Our experimental evaluations show that FwFMs can achieve competitive prediction performance with as few as 4% of the parameters of FFMs. When using the same number of parameters, FwFMs can bring 0.92% and 0.47% AUC lift over FFMs on two real CTR prediction data sets.
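A small sketch may help show why a per-field-pair weight is cheap. It assumes that each pairwise interaction is the dot product of two feature embeddings scaled by one scalar per field pair, and that an FFM instead keeps one embedding per feature for every other field. The counts cover interaction parameters only, with the same embedding size k for both models; names and numbers are illustrative, and the 4% figure from the abstract depends on the paper's own settings and is not reproduced here.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Vector = List[float]


def fwfm_interactions(active: List[Tuple[str, str]],
                      embeddings: Dict[str, Vector],
                      field_pair_weights: Dict[FrozenSet[str], float]) -> float:
    """Sum of r[field pair] * <v_i, v_j> over the active (field, feature) pairs."""
    total = 0.0
    for (field_i, feat_i), (field_j, feat_j) in combinations(active, 2):
        r = field_pair_weights[frozenset((field_i, field_j))]
        v_i, v_j = embeddings[feat_i], embeddings[feat_j]
        total += r * sum(a * b for a, b in zip(v_i, v_j))
    return total


def interaction_param_counts(n_features: int, n_fields: int, k: int) -> Dict[str, int]:
    """Parameters used for pairwise interactions only (linear terms excluded)."""
    return {
        # FFM: one k-dimensional embedding per feature for each other field.
        "FFM": n_features * (n_fields - 1) * k,
        # FwFM: one embedding per feature plus one scalar per field pair.
        "FwFM": n_features * k + n_fields * (n_fields - 1) // 2,
    }


active = [("publisher", "espn"), ("advertiser", "nike"), ("device", "mobile")]
embeddings = {"espn": [1.0, 0.0], "nike": [1.0, 1.0], "mobile": [0.0, 2.0]}
field_pair_weights = {
    frozenset(("publisher", "advertiser")): 2.0,
    frozenset(("publisher", "device")): 1.0,
    frozenset(("advertiser", "device")): 0.5,
}

print(fwfm_interactions(active, embeddings, field_pair_weights))
print(interaction_param_counts(n_features=1_000_000, n_fields=10, k=8))
```

Under these assumptions the field-pair weights add only m(m-1)/2 scalars for m fields, so the model size is dominated by the single embedding table rather than by the number of fields.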

artificial intelligence, interaction strength, machine learning, (13 more...)

arXiv: 1806.03514

Country:

Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Marketing (1.00)
Information Technology > Services (1.00)

Technology:

Information Technology > Data Science (0.94)
Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
